{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Problems"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 20-1 Space requirements for van Emde Boas trees\n",
    "\n",
    "> This problem explores the space requirements for van Emde Boas trees and suggests a way to modify the data structure to make its space requirement depend on the number $n$ of elements actually stored in the tree, rather than on the universe size $u$. For simplicity, assume that $\\sqrt{u}$ is always an integer.\n",
    "\n",
    "> __*a*__. Explain why the following recurrence characterizes the space requirement $P(u)$ of a van Emde Boas tree with universe size u:\n",
    "\n",
    "> $P(u) = (\\sqrt{u} + 1) P(\\sqrt{u}) + \\Theta(\\sqrt{u})$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$\\sqrt{u}$: number of clusters.\n",
    "\n",
    "$1$: number of summary.\n",
    "\n",
    "$P(\\sqrt{u})$: space of cluster/summary.\n",
    "\n",
    "$\\Theta(\\sqrt{u})$: pointers of clusters."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*b*__. Prove that recurrence (20.5) has the solution $P(u) = O(u)$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Suppose $P(u) \\le cu - d$,\n",
    "\n",
    "$$\n",
    "\\begin{array}{rlll}\n",
    "P(u) &=& (\\sqrt{u} + 1) P(\\sqrt{u}) + \\Theta(\\sqrt{u}) \\\\\n",
    "&\\le& (\\sqrt{u} + 1) \\cdot c \\cdot (\\sqrt{u} - d) + \\Theta(\\sqrt{u}) \\\\\n",
    "&=& cu + c(1-d)\\sqrt{u} - cd + \\Theta(\\sqrt{u}) & (d > 1)\\\\\n",
    "&\\le& cu\n",
    "\\end{array}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> In order to reduce the space requirements, let us define a __*reduced-space van Emde Boas tree*__, or __*RS-vEB tree*__, as a __*vEB tree*__ $V$ but with the following changes:\n",
    "\n",
    "> * The attribute $V.cluster$, rather than being stored as a simple array of pointers to vEB trees with universe size $\\sqrt{u}$, is a hash table (see Chapter 11) stored as a dynamic table (see Section 17.4). Corresponding to the array version of $V.cluster$, the hash table stores pointers to RS-vEB trees with universe size $\\sqrt{u}$. To find the $i$th cluster, we look up the key $i$ in the hash table, so that we can find the $i$th cluster by a single search in the hash table.\n",
    "\n",
    "> * The hash table stores only pointers to nonempty clusters. A search in the hash table for an empty cluster returns NIL, indicating that the cluster is empty.\n",
    "\n",
    "> * The attribute $V.summary$ is NIL if all clusters are empty. Otherwise, $V.summary$ points to an RS-vEB tree with universe size $\\sqrt{u}$.\n",
    "\n",
    "> Because the hash table is implemented with a dynamic table, the space it requires is proportional to the number of nonempty clusters.\n",
    "\n",
    "> When we need to insert an element into an empty RS-vEB tree, we create the RS-vEB tree by calling the following procedure, where the parameter u is the universe size of the RS-vEB tree:\n",
    "\n",
    "> ```\n",
    "CREATE-NEW-RS-VEB-TREE(u)\n",
    "1  allocate a new vEB tree V\n",
    "2  V.u = u\n",
    "3  V.min = NIL\n",
    "4  V.max = NIL\n",
    "5  V.summary = NIL\n",
    "6  create V.cluster as an empty dynamic hash table\n",
    "7  return V\n",
    "```\n",
    "\n",
    "> __*c*__. Modify the VEB-TREE-INSERT procedure to produce pseudocode for the procedure RS-VEB-TREE-INSERT$(V, x)$, which inserts $x$ into the RS-vEB tree $V$, calling CREATE-NEW-RS-VEB-TREE as appropriate."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```python\n",
    "    def get_cluster(self, x):\n",
    "        if self.cluster[x] is None:\n",
    "            self.cluster[x] = VanEmdeBoasTree(self.sqrt_l)\n",
    "        return self.cluster[x]\n",
    "\n",
    "    def get_summary(self):\n",
    "        if self.summary is None:\n",
    "            self.summary = VanEmdeBoasTree(self.sqrt_h)\n",
    "        return self.summary\n",
    "        \n",
    "    def insert(self, x):\n",
    "        if self.min is None:\n",
    "            self.insert_empty(x)\n",
    "        else:\n",
    "            if x < self.min:\n",
    "                x, self.min = self.min, x\n",
    "            if not self.is_leaf():\n",
    "                if self.get_cluster(self.high(x)).minimum() is None:\n",
    "                    self.get_summary().insert(self.high(x))\n",
    "                    self.get_cluster(self.high(x)).insert_empty(self.low(x))\n",
    "                else:\n",
    "                    self.get_cluster(self.high(x)).insert(self.low(x))\n",
    "            if x > self.max:\n",
    "                self.max = x\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*d*__. Modify the VEB-TREE-SUCCESSOR procedure to produce pseudocode for the procedure RS-VEB-TREE-SUCCESSOR$(V, x)$, which returns the successor of $x$ in RS-vEB tree $V$, or NIL if $x$ has no successor in $V$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```python\n",
    "    def successor(self, x):\n",
    "        if self.is_leaf():\n",
    "            if x == 0 and self.max == 1:\n",
    "                return 1\n",
    "            return None\n",
    "        if self.min is not None and x < self.min:\n",
    "            return self.min\n",
    "        max_low = self.get_cluster(self.high(x)).maximum()\n",
    "        if max_low is not None and self.low(x) < max_low:\n",
    "            offset = self.get_cluster(self.high(x)).successor(self.low(x))\n",
    "            return self.index(self.high(x), offset)\n",
    "        succ_cluster = self.get_summary().successor(self.high(x))\n",
    "        if succ_cluster is None:\n",
    "            return None\n",
    "        offset = self.cluster[succ_cluster].minimum()\n",
    "        return self.index(succ_cluster, offset)\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*e*__. Prove that, under the assumption of simple uniform hashing, your RS-VEBTREE-INSERT and RS-VEB-TREE-SUCCESSOR procedures run in $O(\\lg \\lg u)$ expected time."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The hashing tasks about $\\Theta(1)$ time, thus the procedures run in $O(\\lg \\lg u)$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*f*__. Assuming that elements are never deleted from a vEB tree, prove that the space requirement for the RS-vEB tree structure is $O(n)$, where $n$ is the number of elements actually stored in the RS-vEB tree."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*g*__. RS-vEB trees have another advantage over vEB trees: they require less time to create. How long does it take to create an empty RS-vEB tree?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$\\Theta(\\sqrt{u})$ to create the hash table."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 20-2 y-fast tries\n",
    "\n",
    "> This problem investigates D. Willard's \"y-fast tries\" which, like van Emde Boas trees, perform each of the operations MEMBER, MINIMUM, MAXIMUM, PREDECESSOR, and SUCCESSOR on elements drawn from a universe with size $u$ in $O(\\lg\\lg u)$ worst-case time. The INSERT and DELETE operations take $O(\\lg\\lg u)$ amortized time. Like reduced-space van Emde Boas trees (see Problem 20-1), yfast tries use only $O(n)$ space to store $n$ elements. The design of y-fast tries relies on perfect hashing (see Section 11.5).\n",
    "\n",
    "> As a preliminary structure, suppose that we create a perfect hash table containing not only every element in the dynamic set, but every prefix of the binary representation of every element in the set. For example, if $u = 16$, so that $\\lg u = 4$, and $x = 13$ is in the set, then because the binary representation of $13$ is $1101$, the perfect hash table would contain the strings $1$, $11$, $110$, and $1101$. In addition to the hash table, we create a doubly linked list of the elements currently in the set, in increasing order.\n",
    "\n",
    "> __*a*__. How much space does this structure require?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$O(n \\lg u)$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*b*__. Show how to perform the MINIMUM and MAXIMUM operations in $O(1)$ time; the MEMBER, PREDECESSOR, and SUCCESSOR operations in $O(\\lg \\lg u)$ time; and the INSERT and DELETE operations in $O(\\lg u)$ time.\n",
    "\n",
    "> To reduce the space requirement to $O(n)$, we make the following changes to the data structure:\n",
    "\n",
    "> * We cluster the $n$ elements into $n / \\lg u$ groups of size $\\lg u$. (Assume for now that $\\lg u$ divides $n$.) The first group consists of the $\\lg u$ smallest elements in the set, the second group consists of the next $\\lg u$ smallest elements, and so on.\n",
    "\n",
    "> * We designate a \"representative\" value for each group. The representative of the $i$th group is at least as large as the largest element in the $i$th group, and it is smaller than every element of the $(i+1)$st group. (The representative of the last group can be the maximum possible element $u - 1$.) Note that a representative might be a value not currently in the set.\n",
    "\n",
    "> * We store the $\\lg u$ elements of each group in a balanced binary search tree, such as a red-black tree. Each representative points to the balanced binary search tree for its group, and each balanced binary search tree points to its group's representative.\n",
    "\n",
    "> * The perfect hash table stores only the representatives, which are also stored in a doubly linked list in increasing order.\n",
    "\n",
    "> We call this structure a __*y-fast trie*__.\n",
    "\n",
    "> __*c*__. Show that a y-fast trie requires only $O(n)$ space to store $n$ elements."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The doubly linked list has less than $n$ elements, while the binary search trees contains $n$ nodes, thus a y-fast trie requires $O(n)$ space.\n",
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*d*__. Show how to perform the MINIMUM and MAXIMUM operations in $O(\\lg \\lg u)$ time with a y-fast trie."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "MINIMUM: Find the minimum representative in the doubly linked list in $\\Theta(1)$, then find the minimum element in the binary search tree in $O(\\lg \\lg u)$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*e*__. Show how to perform the MEMBER operation in $O(\\lg \\lg u)$ time."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Find the smallest representative greater than $k$ with binary searching in $\\Theta(\\lg \\lg u)$, find the element in the binary search tree in $O(\\lg \\lg u)$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*f*__. Show how to perform the PREDECESSOR and SUCCESSOR operations in\n",
    "$O(\\lg \\lg u)$ time."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "SUCCESSOR: Find the smallest representative greater than $k$ with binary searching in $\\Theta(\\lg \\lg u)$, then find whether there is an element in this cluster that is larger than $k$ in $O(\\lg \\lg u)$. If there is on element greater than $k$ in the representative's cluster, then the successor is in the next representative's cluster, we can locate the next representative with the doubly linked list in $\\Theta(1)$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*g*__. Explain why the INSERT and DELETE operations take $\\Omega(\\lg \\lg u)$ time."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Same as __*e*__, we need to find the cluster in $\\Theta(\\lg \\lg u)$, then the operations in the binary search tree takes $O(\\lg \\lg u)$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> __*h*__. Show how to relax the requirement that each group in a y-fast trie has exactly $\\lg u$ elements to allow INSERT and DELETE to run in $O(\\lg \\lg u)$ amortized time without affecting the asymptotic running times of the other operations."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Fixed representatives."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
